Abstract: Modern applications implemented on FPGAs exhibit high connectivity demands. In this paper we study the routing constraints of Virtex devices and propose a systematic methodology for designing a novel general-purpose interconnection network targeting reconfigurable architectures. This network consists of multiple segment wires and switch box (SB) patterns, appropriately selected and assigned across the device. The goal of the proposed methodology is to maximize the utilization of the fabricated routing resources. The derived interconnection scheme is integrated into a Virtex-style FPGA. This device is characterized both by its high performance and by its low energy requirements. For this reason, the design criterion that guided our architecture selections was the minimal Energy×Delay Product (EDP). The methodology is fully supported by three new software tools, which belong to the MEANDER Design Framework. Using a typical set of MCNC benchmarks, an extensive comparison study in terms of several critical parameters demonstrates the effectiveness of the derived interconnection network. More specifically, we achieve an average EDP reduction of 63%, a performance increase of 26%, a leakage power reduction of 21%, and a total energy consumption reduction of 11%, at the expense of a 20% increase in channel width.
Abstract: In this paper, low-end Digital Signal Processors (DSPs)
are applied to accelerate integer neural networks. The use of DSPs
to accelerate neural networks has been a topic of study for some
time, and has demonstrated significant performance improvements.
Recently, work has been done on integer-only neural networks, which
greatly reduce hardware requirements and thus allow for cheaper
hardware implementation. DSPs with Arithmetic Logic Units (ALUs)
that support floating-point or fixed-point arithmetic are generally more
expensive than their integer-only counterparts due to increased circuit
complexity. However, if the need for floating-point or fixed-point math
operations can be removed, then simpler, lower-cost DSPs can be
used. To achieve this, an integer-only neural network is created in
this paper, which is then accelerated by using DSP instructions to
improve performance.
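As a rough illustration of the integer-only idea, the sketch below evaluates one fully connected layer using nothing but integer multiply-accumulate, a right shift for rescaling, and saturation, which is the kind of work a DSP MAC instruction performs. The function name `int_dense`, the shift amount, and the 8-bit output range are assumptions made for this example and are not taken from the paper.

```python
def int_dense(x, weights, biases, shift=7, lo=-128, hi=127):
    """One integer-only dense layer: MAC, power-of-two rescale, ReLU, saturate."""
    out = []
    for row, b in zip(weights, biases):
        acc = b
        for xi, wi in zip(x, row):
            acc += xi * wi              # multiply-accumulate, integers only
        acc >>= shift                   # rescale by 2**shift, no division unit needed
        acc = max(acc, 0)               # ReLU activation
        out.append(min(max(acc, lo), hi))  # saturate to the assumed 8-bit range
    return out


if __name__ == "__main__":
    print(int_dense([64, -32], [[127, 0], [0, 127], [64, 64]], [0, 0, 0]))
```

The inner loop maps directly onto a single-cycle MAC on most DSP cores, which is where the acceleration in such a design would be expected to come from.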
Abstract: Proper assessment of interaxial distance and
convergence control are important factors in stereoscopic imaging
technology for producing an effective 3D image. To control interaxial
distance and convergence for efficient 3D shooting, a horizontal 3D
camera rig is designed using hardware components such as an 'LM
Guide', a 'Goniometer' and a 'Rotation Stage'. The horizontal 3D camera
rig system can be properly aligned by moving the two cameras
horizontally in the same or opposite directions, by adjusting the camera
angle, and finally by considering horizontal as well as vertical
swing. In this paper, the relationship between interaxial distance and
convergence angle control is discussed, and intensive experiments are
performed to demonstrate easy and effective 3D shooting.
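The relationship between interaxial distance and convergence angle can be sketched with elementary geometry. The example assumes a symmetric toed-in rig in which both cameras rotate by the same amount toward a convergence point straight ahead. That model, and the function names, are assumptions for illustration and not necessarily the rig model used in the paper.

```python
import math


def convergence_angle_deg(interaxial, distance):
    """Total angle between the two optical axes for a symmetric toed-in rig.

    interaxial and distance must use the same length unit.
    """
    return math.degrees(2.0 * math.atan2(interaxial / 2.0, distance))


def interaxial_for_angle(angle_deg, distance):
    """Inverse relation: interaxial distance that yields a given total angle."""
    return 2.0 * distance * math.tan(math.radians(angle_deg) / 2.0)


if __name__ == "__main__":
    # 65 mm interaxial distance, convergence point 2 m away
    print(round(convergence_angle_deg(65.0, 2000.0), 3))
```

Under this model, widening the interaxial distance at a fixed convergence distance increases the required camera rotation, which is why the two controls have to be adjusted together.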
Abstract: Tablet computers and Multifunctional Hardcopy Devices (MHDs) are common devices in daily life. However, few scientific studies on them have been published. Tablet computers are straightforward to use, whereas MHDs are comparatively difficult to use. Thus, to assist users at different levels, we propose combining these two devices to achieve a straightforward intelligent user interface (UI) and versatile What You See Is What You Get (WYSIWYG) document management and production. Our approach is to design an intelligent, user-dependent UI for an MHD using a tablet computer. Furthermore, we propose a hardware interconnection and versatile intelligent software between these two devices. In this study, we first provide a state-of-the-art survey of MHDs and tablet computers and their interconnections. Second, we provide a comparative UI survey of two state-of-the-art MHDs, with a proposal of a novel UI for MHDs, using Jakob Nielsen's Ten Usability Heuristics evaluation.
Abstract: In today's highly globalised and competitive world,
access to information plays a key role in gaining the upper hand over
business rivals. Hence, proper protection of such a crucial resource is
core to any modern business. Implementing a successful information
security system is basically centered on three pillars: technical
solutions involving both software and hardware, information security
controls that translate policies and procedures into the system, and the
people who implement them. This paper shows that a lot needs to be done for
countries adopting information technology to process, store and
distribute information to adequately secure such a core resource.
Abstract: Streaming applications usually consist of stages, running in
parallel or in series, that incrementally transform a stream of input
data. It is a design challenge to break such an application into distinguishable
blocks and then to map them onto independent hardware processing
elements. This requires a generic controller that
automatically maps such a stream of data onto independent processing
elements without any dependencies or manual considerations. In
this paper, a Kahn Process Network (KPN) for such streaming
applications is designed and developed to be mapped onto an
MPSoC. It is designed around a generic C-based
compiler that takes the mapping specifications as input
from the user, automates these design constraints, and
automatically generates synthesized, optimized RTL code for the
specified application.
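The Kahn Process Network model itself can be illustrated in a few lines: independent processes communicate only through FIFO channels and block on reads. The sketch below is a software analogy using threads and queues. It is not the C-based compiler or the RTL flow described in the paper, and the stage functions and the `END` token are assumptions of the example.

```python
import queue
import threading

END = object()  # end-of-stream token (an assumption of this sketch)


def producer(out_q, values):
    """Source process: writes the input stream to its output channel."""
    for v in values:
        out_q.put(v)
    out_q.put(END)


def transform(in_q, out_q, fn):
    """Kahn process: blocking read, apply fn, write to the output channel."""
    while True:
        item = in_q.get()          # blocks until a token is available
        if item is END:
            out_q.put(END)
            return
        out_q.put(fn(item))


def run_pipeline(values):
    """Wire three processes in series and collect the output stream."""
    q1, q2, q3 = queue.Queue(), queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=producer, args=(q1, values)),
        threading.Thread(target=transform, args=(q1, q2, lambda v: v * 2)),
        threading.Thread(target=transform, args=(q2, q3, lambda v: v + 1)),
    ]
    for t in threads:
        t.start()
    results = []
    while True:
        item = q3.get()
        if item is END:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_pipeline([1, 2, 3]))
```

Because each channel is a FIFO with blocking reads, the output is the same regardless of how the threads are scheduled. This determinism is the property that makes the model attractive for mapping onto independent processing elements.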
Abstract: Various mechanisms providing mutual exclusion and
thread synchronization can be used to support parallel processing
within a single computer. Instead of using locks, semaphores, barriers
or other traditional approaches, in this paper we focus on alternative
ways of making better use of modern multithreaded architectures
and preparing hash tables for concurrent access. Hash structures
are used to demonstrate two entirely different
approaches (rule-based cooperation and hardware synchronization
support) and to compare them with an efficient parallel implementation
using traditional locks. The comparison includes implementation details, performance
ranking and scalability issues. We aim at understanding the effects
the parallelization schemes have on the execution environment with
special focus on the memory system and memory access
characteristics.
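For reference, the traditional lock-based baseline mentioned above might look like the following sketch, which protects groups of buckets with separate locks (lock striping) so that threads touching different stripes do not contend. The class name, bucket count and stripe count are assumptions for illustration. The rule-based and hardware-supported alternatives studied in the paper are not shown.

```python
import threading


class StripedHashTable:
    """Chained hash table guarded by a small array of locks (lock striping)."""

    def __init__(self, n_buckets=64, n_locks=8):
        self._buckets = [[] for _ in range(n_buckets)]
        self._locks = [threading.Lock() for _ in range(n_locks)]

    def _locate(self, key):
        idx = hash(key) % len(self._buckets)
        # Each bucket always maps to the same lock.
        return self._buckets[idx], self._locks[idx % len(self._locks)]

    def put(self, key, value):
        bucket, lock = self._locate(key)
        with lock:
            for i, (k, _) in enumerate(bucket):
                if k == key:
                    bucket[i] = (key, value)   # overwrite existing entry
                    return
            bucket.append((key, value))

    def get(self, key, default=None):
        bucket, lock = self._locate(key)
        with lock:
            for k, v in bucket:
                if k == key:
                    return v
        return default
```

The number of stripes trades memory for contention: a single lock serializes every access, while one lock per bucket minimizes contention at the cost of more lock objects.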
Abstract: In this paper, 3×3 routing nodes are proposed to
provide speedup and parallel processing capability in Data Vortex
network architectures. The new design not only significantly
improves network throughput and latency, but also eliminates the
need for the distributive traffic control mechanism originally embedded
among nodes and the need for nodal buffering. Cost effectiveness
is studied through a comparison with the previously proposed
2-input buffered networks, and considerable performance enhancement
can be achieved with similar or lower hardware cost. Unlike the
previous implementation, the network leaves a small probability of
contention; therefore, the packet drop rate must be kept low for such an
implementation to be feasible and attractive, which can be achieved
with a proper choice of operating conditions.
Abstract: Today's Voltage Regulator Modules (VRMs) face increasing design challenges as the number of transistors in microprocessors increases in line with Moore's Law. These challenges have recently become even more demanding as microprocessors operate in the sub-volt range at significantly high currents. This paper presents a new multiphase topology with a cell configuration for improved performance in low-voltage, high-current applications. A lab-scale hardware prototype of the new topology was designed and constructed. Laboratory tests were performed on the proposed converter and compared with a commercially available VRM. The proposed topology exhibits improved performance compared to its commercially available counterpart.
Abstract: In this paper, an on-line condition monitoring method for transmission lines is proposed using electrical circuit theory and information technology. It is reasonable to expect that circuit parameters such as resistance (R), inductance (L), conductance (g) and capacitance (C) of a transmission line reveal the electrical condition and physical state of the line. These parameters can be calculated from linear equations composed of the voltages and currents measured by synchro-phasor measurement techniques at both ends of the line. A set of linear voltage drop equations containing the four terminal constants (A, B, C, D) is the mathematical model of the transmission line circuit. When at least two sets of these linear equations are established from different operating conditions of the line, they mathematically yield the circuit parameters of the line. The condition of line connectivity, including the state of connecting or contacting parts of the switching devices, may be monitored through resistance variations during operation. The insulation condition of the line can be monitored through conductance (g) and capacitance (C) measurements. Together with other condition monitoring devices, such as partial discharge sensors and visual sensing devices, these measurements may give useful information for detecting incipient symptoms of faults. A prototype hardware system has been developed and tested on laboratory-scale simulated transmission lines. The tests have provided sufficient evidence to put the proposed method to practical use.
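The step from two measurement sets to the four terminal constants can be sketched as follows, assuming the standard two-port relations Vs = A·Vr + B·Ir and Is = C·Vr + D·Ir between sending-end and receiving-end phasors. The function name `solve_abcd` and the tuple layout are assumptions for this example. Noise handling and the subsequent extraction of R, L, g and C are not shown.

```python
def solve_abcd(m1, m2):
    """Recover (A, B, C, D) from two operating conditions.

    Each measurement is a tuple (Vs, Is, Vr, Ir) of complex phasors.
    Solves two 2x2 complex linear systems with Cramer's rule.
    """
    vs1, is1, vr1, ir1 = m1
    vs2, is2, vr2, ir2 = m2
    det = vr1 * ir2 - vr2 * ir1
    if abs(det) < 1e-12:
        raise ValueError("the two operating conditions are not independent")
    a = (vs1 * ir2 - vs2 * ir1) / det
    b = (vr1 * vs2 - vr2 * vs1) / det
    c = (is1 * ir2 - is2 * ir1) / det
    d = (vr1 * is2 - vr2 * is1) / det
    return a, b, c, d


if __name__ == "__main__":
    # Synthetic line constants, used only to generate consistent test phasors.
    A, B, C, D = 0.98 + 0.01j, 5 + 40j, 1e-4j, 0.98 + 0.01j

    def forward(vr, ir):
        return (A * vr + B * ir, C * vr + D * ir, vr, ir)

    print(solve_abcd(forward(100 + 0j, 1 - 0.2j), forward(98 + 3j, 2 + 0.5j)))
```

If the line is represented by a nominal-π equivalent, B equals the series impedance R + jωL, so a drift in the real part of B over time would correspond to the resistance variation the abstract proposes to monitor.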
Abstract: A new and highly efficient architecture for elliptic curve scalar point multiplication, which is optimized for a binary field recommended by NIST and is well suited for elliptic curve cryptographic (ECC) applications, is presented. To achieve the maximum architectural and timing improvements, we have reorganized and reordered the critical path of the Lopez-Dahab scalar point multiplication architecture such that logic structures are implemented in parallel and operations in the critical path are diverted to noncritical paths. With G=41, the proposed design is capable of performing a field multiplication over the extension field with degree 163 in 11.92 µs with the maximum achievable frequency of 251 MHz on Xilinx Virtex-4 (XC4VLX200) while 22% of the chip area is occupied, where G is the digit size of the underlying digit-serial finite field multiplier.
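To make the role of the digit size G concrete, the sketch below models a most-significant-digit-first digit-serial multiplier over GF(2^163) in software, using the NIST reduction polynomial x^163 + x^7 + x^6 + x^3 + 1. It is a behavioural model only and says nothing about the paper's hardware structure. Field elements are represented as Python integers whose bits are polynomial coefficients, and all function names are assumptions.

```python
M = 163
# NIST reduction polynomial for the degree-163 binary field
F = (1 << 163) | (1 << 7) | (1 << 6) | (1 << 3) | 1


def reduce_poly(p):
    """Reduce a binary polynomial modulo F."""
    while p.bit_length() > M:
        p ^= F << (p.bit_length() - 1 - M)   # cancel the leading term
    return p


def clmul(a, b):
    """Carry-less (polynomial) multiplication over GF(2)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def digit_serial_mul(a, b, g=41):
    """Multiply a*b in GF(2^163), consuming g bits of b per iteration."""
    n_digits = -(-M // g)                    # ceil(M / g)
    mask = (1 << g) - 1
    acc = 0
    for i in reversed(range(n_digits)):
        digit = (b >> (i * g)) & mask
        # Horner step: shift accumulator by one digit, add partial product
        acc = reduce_poly((acc << g) ^ clmul(a, digit))
    return acc
```

With g = 41 the loop runs ceil(163/41) = 4 times per field multiplication, compared with 163 iterations for a bit-serial design. This is the area versus latency trade-off that the digit size controls.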
Abstract: A new genetic algorithm, termed the 'optimum individual monogenetic genetic algorithm' (OIMGA), is presented whose properties have been deliberately designed to be well suited to hardware implementation. Specific design criteria were to ensure fast access to the individuals in the population, to keep the required silicon area for hardware implementation to a minimum and to incorporate flexibility in the structure for the targeting of a range of applications. The first two criteria are met by retaining only the current optimum individual, thereby guaranteeing a small memory requirement that can easily be stored in fast on-chip memory. Also, OIMGA can be easily reconfigured to allow the investigation of problems that normally warrant either large GA populations or individuals many genes in length. Local convergence is achieved in OIMGA by retaining elite individuals, while population diversity is ensured by continually searching for the best individuals in fresh regions of the search space. The results given in this paper demonstrate that both the performance of OIMGA and its convergence time are superior to those of a range of existing hardware GA implementations.
Abstract: This paper proposes a novel frequency offset (FO) estimator for orthogonal frequency division multiplexing (OFDM). Simplicity is the most significant feature of this algorithm, and it can be iterated to achieve acceptable accuracy. In addition, the fractional and integer parts of the FO are estimated jointly using the same algorithm. To do so, instead of using conventional algorithms, which usually rely on a correlation function, we use the DFT of the received signal. Complexity is therefore reduced, and the synchronization procedure can be performed by the same hardware that is used to demodulate the OFDM symbol. Finally, computer simulation shows that the accuracy of this method is better than that of conventional methods.
Abstract: This paper presents a low-cost automatic system for
sampling the electric field in a limited area. The scanning area is a
flat surface parallel to the ground at a selected height. We discuss
in detail the hardware, software and all the arrangements involved
in the system operation. To show the system performance,
we include a campaign of narrowband measurements with 6017
sample points in the surroundings of a cellular base station. A
commercial isotropic antenna with three orthogonal axes was used
as the sampling device. The results are analyzed in terms of their spatial
average, standard deviation and statistical distribution.
Abstract: A multi-board run-time reconfigurable (MRTR)
system for evolvable hardware (EHW) is introduced with the aim of
implementing the bidirectional incremental evolution (BIE)
method in hardware. The main features of this digital intrinsic EHW solution are
the multi-board approach, variable chromosome length
management and partial configuration of the reconfigurable
circuit. These three features make the solution highly scalable.
The design has been written in VHDL so as to be
platform independent and keep flexibility as high as
possible. This solution helps tackle the problem of evolving
complex tasks on digital configurable platforms.
Abstract: Vision-based intelligent vehicle applications often require large amounts of memory to handle video streaming and image processing, which in turn increases the complexity of hardware and software. This paper presents an FPGA implementation of a vision-based blind spot warning system. Using video frames, the information in the blind spot area is reduced to one-dimensional information. Analysis of the estimated entropy of the image allows timely detection of an object. This idea has been implemented on the XtremeDSP video starter kit. The blind spot warning system uses only 13% of its logic resources and 95 kbits of block memory, and its frame rate is over 30 frames per second (fps).
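The two ideas in this abstract, reducing a frame region to one dimension and analysing its entropy, can be sketched as below. The abstract does not say how the reduction is performed, so the column-averaging step here is purely an assumption, as are the function names. Only the Shannon entropy formula is standard.

```python
import math
from collections import Counter


def column_profile(frame):
    """Collapse a 2-D region (list of pixel rows) to 1-D by averaging columns."""
    rows = len(frame)
    return [sum(col) // rows for col in zip(*frame)]


def shannon_entropy(samples):
    """Shannon entropy in bits of the empirical distribution of samples."""
    n = len(samples)
    counts = Counter(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


if __name__ == "__main__":
    empty_road = [[10, 10, 10, 10], [10, 10, 10, 10]]
    with_object = [[10, 10, 200, 200], [10, 10, 200, 200]]
    print(shannon_entropy(column_profile(empty_road)))
    print(shannon_entropy(column_profile(with_object)))
```

A uniform region yields zero entropy while a region containing a contrasting object yields a higher value, so a detector of this kind would compare the entropy against a threshold. Working on a 1-D profile instead of the full frame is what would keep the memory footprint small.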
Abstract: The quality of 2D and 3D cross-sectional images produced
by Computed Tomography depends primarily upon the degree of
precision of primary and secondary X-ray intensity detection.
The traditional method of primary intensity detection is prone to errors.
Recently, an X-ray intensity measurement system with smart
X-ray sensors has been developed by our group, which is able to detect
primary X-ray intensity unerringly. In this study, a new smart X-ray
sensor is developed using the TSL230 light-to-frequency converter
from Texas Instruments, which has numerous advantages in terms of
noiseless data acquisition and transmission. The TSL230 is
based on a silicon photodiode that converts incoming X-ray
radiation into a proportional current signal. A current-to-frequency
converter is attached to this photodiode on a single monolithic CMOS
integrated circuit, which provides a frequency count proportional to the
incoming current signal in the form of a pulse train. The frequency
count is delivered to the PICDEM FS USB board with a
PIC18F4550 microcontroller mounted on it. With highly compact
electronic hardware, this demo board efficiently reads the smart
sensor output data. The frequency-output approach overcomes the
nonlinear behavior of sensors with analog output; thus, unattenuated
X-ray intensities can be measured precisely and better
normalization can be achieved in order to attain high resolution.
Abstract: An efficient architecture for a low-jitter All Digital
Phase Locked Loop (ADPLL) suitable for high-speed SoC
applications is presented in this paper. The ADPLL is designed using
standard cells and described in a Hardware Description Language
(HDL). The ADPLL, implemented in a 90 nm CMOS process, can
operate from 10 to 200 MHz and achieves worst-case frequency
acquisition in 14 reference clock cycles. Simulation results show
that the PLL has a cycle-to-cycle jitter of 164 ps and a period jitter of 100 ps
at 100 MHz. Since the digitally controlled oscillator (DCO) can
achieve both high resolution and a wide frequency range, it can meet
the demands of system-level integration. The proposed ADPLL can
easily be ported to different processes in a short time. Thus, it can
reduce the design time and design complexity of the ADPLL, making
it very suitable for System-on-Chip (SoC) applications.
Abstract: This paper presents a dynamic adaptation scheme for
the frequency of inter-deme migration in distributed genetic algorithms
(GA), and its VLSI hardware design. Distributed GA,
or multi-deme-based GA, uses multiple populations which evolve
concurrently. The purpose of dynamic adaptation is to improve
convergence performance so as to obtain better solutions. Through
simulation experiments, we show that our scheme achieves better
performance than fixed-frequency migration schemes.
Abstract: A proposal for a secure stream cipher based on Linear Feedback Shift Registers (LFSRs) is presented. In this method, the shift register structure used for polynomial modular division is combined with an LFSR keystream generator to yield a new keystream generator with a much longer period. Security is brought into this structure by using a Boolean function to combine state bits of the LFSR keystream generator and taking the output through that Boolean function. This introduces non-linearity and security into the structure in a way similar to the non-linear filter generator. The security and throughput of the suggested stream cipher are found to be much greater than those of known LFSR-based structures for the same key length.
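The non-linear filter generator that the abstract uses as its point of comparison can be sketched with a toy example. The 4-bit register, the feedback recurrence s[n+4] = s[n] XOR s[n+1] (primitive polynomial x^4 + x + 1), and the filter function are all assumptions chosen for readability. The sketch omits the modular-division register that the proposed cipher adds, and a register this small offers no security.

```python
def lfsr_states(seed):
    """Yield successive states of a 4-bit Fibonacci LFSR.

    Recurrence: s[n+4] = s[n] ^ s[n+1]; the seed must be non-zero.
    """
    state = list(seed)
    while True:
        yield tuple(state)
        state = state[1:] + [state[0] ^ state[1]]


def filter_bit(state):
    """Non-linear Boolean filter over the state bits (the AND term adds non-linearity)."""
    s0, s1, s2, s3 = state
    return s0 ^ (s1 & s2) ^ s3


def keystream(seed, n):
    """First n keystream bits for the given seed."""
    gen = lfsr_states(seed)
    return [filter_bit(next(gen)) for _ in range(n)]


def xor_bits(bits, seed):
    """Encrypt or decrypt a bit list by XOR with the keystream."""
    return [b ^ k for b, k in zip(bits, keystream(seed, len(bits)))]
```

Applying `xor_bits` twice with the same seed returns the original bits, as with any synchronous stream cipher. The state sequence of this register repeats after 15 steps, which illustrates why the period of the underlying generator matters.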