Abstract: Hardware realization of a Neural Network (NN) depends to a large extent on the efficient implementation of a single neuron. FPGA-based reconfigurable computing architectures are suitable for hardware implementation of neural networks. FPGA realization of ANNs with a large number of neurons is still a challenging task. This paper discusses the issues involved in implementing a multi-input neuron with linear/nonlinear excitation functions on an FPGA. An implementation method with a resource/speed tradeoff is proposed to handle signed decimal numbers. The VHDL code developed is tested on a Xilinx XC V50hq240 chip. To improve the speed of operation, a lookup table (LUT) method is used. The problems involved in using an LUT for a nonlinear function are discussed. The percentage saving in resources and the improvement in speed with an LUT for a neuron are reported. An attempt is also made to derive a generalized formula for a multi-input neuron that gives an approximate estimate of the total resource requirement and achievable speed for a given multilayer neural network. This helps the designer choose the FPGA capacity for a given application. Using the proposed method of implementation, a neural-network-based application, namely a space vector modulator for a vector-controlled drive, is presented.
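As an illustration of the lookup-table idea in the abstract above, the following is a minimal sketch of a fixed-point multi-input neuron whose sigmoid activation is read from a precomputed table. The Q-format (`FRAC_BITS`), table depth (`LUT_SIZE`) and input range (`X_MIN`, `X_MAX`) are assumptions chosen for the example, not values taken from the paper.

```python
import math

FRAC_BITS = 8               # assumed: 8 fractional bits for inputs and weights
LUT_SIZE = 256              # assumed: table depth
X_MIN, X_MAX = -8.0, 8.0    # assumed: input range covered by the table

# Sigmoid samples stored as fixed-point integers, as a ROM would hold them.
SIGMOID_LUT = [
    round((1 << FRAC_BITS)
          / (1.0 + math.exp(-(X_MIN + (X_MAX - X_MIN) * i / (LUT_SIZE - 1)))))
    for i in range(LUT_SIZE)
]


def to_fixed(x):
    """Quantize a real value to an integer with FRAC_BITS fractional bits."""
    return round(x * (1 << FRAC_BITS))


def neuron(inputs, weights, bias):
    """Fixed-point multiply-accumulate followed by a table lookup."""
    acc = to_fixed(bias) << FRAC_BITS          # align bias with the products
    for x, w in zip(inputs, weights):
        acc += to_fixed(x) * to_fixed(w)       # product has 2*FRAC_BITS fraction bits
    net = acc / (1 << (2 * FRAC_BITS))
    net = min(max(net, X_MIN), X_MAX)          # saturate to the table range
    index = round((net - X_MIN) / (X_MAX - X_MIN) * (LUT_SIZE - 1))
    return SIGMOID_LUT[index] / (1 << FRAC_BITS)
```

The table trades memory for speed: the nonlinear function costs one read instead of an exponential and a division, at the price of a quantization error that depends on the table depth.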
Abstract: In this paper, we propose a Connect6 solver which
adopts a hybrid approach based on a tree-search algorithm and image
processing techniques. The solver must deal with the complicated
computation and provide high performance in order to make real-time
decisions. The proposed approach enables the solver to be
implemented on a single Spartan-6 XC6SLX45 FPGA produced by
Xilinx without using any external devices. The compact
implementation is achieved through image processing techniques to
optimize a tree-search algorithm of the Connect6 game. The tree
search is widely used in computer games and the optimal search brings
the best move in every turn of a computer game. Thus, many
tree-search algorithms such as Minimax algorithm and artificial
intelligence approaches have been widely proposed in this field.
However, there is one fundamental problem in this area: the
computation time increases rapidly as the game tree grows.
Because these algorithms are computed in a highly parallel fashion,
a larger game tree also means a larger circuit.
This paper therefore aims to reduce the size of a Connect6 game tree using
image processing techniques and the game's positional symmetry. The
proposed solver is composed of four computational modules: a
two-dimensional checkmate strategy checker, a template matching
module, a skilful-line predictor, and a next-move selector. These
modules work well together in selecting next moves from some
candidates and the total amount of their circuits is small. The details of
the hardware design for an FPGA implementation are described and
the performance of this design is also shown in this paper.
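One plausible reading of the symmetry-based pruning mentioned above is that board positions related by rotation or reflection need to be evaluated only once. The sketch below maps a position to a canonical representative under the eight symmetries of a square board; the stone representation `(x, y, colour)` and the function names are hypothetical and are not taken from the paper.

```python
def symmetries(stones, size=19):
    """Return the position under all 8 rotations/reflections of the board.

    `stones` is an iterable of (x, y, colour) tuples (assumed representation).
    """
    m = size - 1
    transforms = [
        lambda x, y: (x, y),
        lambda x, y: (m - x, y),
        lambda x, y: (x, m - y),
        lambda x, y: (m - x, m - y),
        lambda x, y: (y, x),
        lambda x, y: (m - y, x),
        lambda x, y: (y, m - x),
        lambda x, y: (m - y, m - x),
    ]
    return [
        tuple(sorted((*t(x, y), c) for x, y, c in stones))
        for t in transforms
    ]


def canonical(stones, size=19):
    """Pick one representative so symmetric positions share a single key."""
    return min(symmetries(stones, size))
```

Using `canonical(...)` as the key of a transposition table lets a search skip up to seven of every eight symmetric positions.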
Abstract: This paper presents an adaptive motion estimator
that can be dynamically reconfigured with the best-suited algorithm
as the nature of the video varies while an application
is running. The 4 Step Search (4SS) and the
Gradient Search (GS) algorithms are integrated in the estimator in
order to be used in the case of rapid and slow video sequences
respectively. The Full Search Block Matching (FSBM) algorithm
has also been integrated for use with
video sequences that are not real-time oriented.
In order to efficiently reduce the computational cost while
achieving better visual quality at a low power cost, the proposed
motion estimator is based on a Variable Block Size (VBS) scheme
that uses only the 16x16, 16x8, 8x16 and 8x8 modes.
Experimental results show that the adaptive motion estimator
achieves better results in terms of Peak Signal to Noise Ratio
(PSNR), computational cost, FPGA occupied area, and dissipated
power relative to the most popular variable block size schemes
presented in the literature.
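For reference, the full-search block matching baseline named in the abstract above can be sketched as an exhaustive search that minimises the sum of absolute differences (SAD). This is a generic textbook formulation on plain nested lists; the function names and the square block size `n` are assumptions of the example, and the paper's 4SS, GS and variable-block-size logic is not reproduced here.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between a block in `cur` and a shifted block in `ref`."""
    return sum(
        abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
        for j in range(n)
        for i in range(n)
    )


def full_search(cur, ref, bx, by, n, search):
    """Try every displacement in [-search, search] and keep the lowest SAD.

    Returns (cost, dx, dy), or None if no candidate lies inside the frame.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= by + dy and by + dy + n <= height
                      and 0 <= bx + dx and bx + dx + n <= width)
            if not inside:
                continue  # candidate block would fall outside the reference frame
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best
```

Fast algorithms such as 4SS visit only a subset of these candidates, which is where the computational saving comes from.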
Abstract: The design and implementation of a novel B-ACOSD CFAR algorithm is presented in this paper. It is proposed for detecting radar targets in a log-normal distribution environment. The B-ACOSD detector is able to automatically determine the number of interfering targets in the reference cells and to detect the real target using an adaptive threshold. The detector is implemented as a System on Chip on an Altera Stratix II FPGA using parallelism and pipelining techniques. For a reference window of 16 cells, the experimental results showed that the processor works properly with a processing speed of up to 115.13 MHz and a processing time of 0.29 µs, and thus meets the real-time requirement of a typical radar system.
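To make the adaptive-threshold idea concrete, here is a minimal sketch of a plain ordered-statistic CFAR test over a 16-cell reference window. It is deliberately not the B-ACOSD algorithm: the automatic censoring step and the log-normal threshold rule are omitted, and the rank `k` and scale factor `alpha` are assumed example values.

```python
def os_cfar(cells, cut_index, half_window=8, k=12, alpha=3.0):
    """Ordered-statistic CFAR decision for the cell under test (CUT).

    The k-th smallest reference cell estimates the background level, so a few
    strong interferers in the window do not raise the threshold.
    """
    low = max(0, cut_index - half_window)
    reference = (cells[low:cut_index]
                 + cells[cut_index + 1:cut_index + 1 + half_window])
    ranked = sorted(reference)
    k = min(k, len(ranked))             # guard against truncated windows at the edges
    threshold = alpha * ranked[k - 1]   # adaptive threshold
    return cells[cut_index] > threshold
```

Because the threshold follows the ranked reference samples, the false-alarm rate stays roughly constant as the background level changes.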
Abstract: In Blind Source Separation (BSS) processing, taking
advantage of the scaling factor indeterminacy and based on the floating-point
representation, we propose a scaling technique applied to the
separation matrix to avoid saturation or weakness in the
recovered source signals. This technique performs an Automatic Gain
Control (AGC) in an on-line BSS environment. We demonstrate
the effectiveness of this technique by using the implementation of
a division-free BSS algorithm with two inputs and two outputs. This
technique is computationally cheaper and efficient for a hardware
implementation.
Abstract: This paper presents the hardware design of a unified
architecture to compute the 4x4, 8x8 and 16x16 efficient two-dimensional
(2-D) transform for the HEVC standard. This
architecture is based on fast integer transform algorithms. It is
designed only with adders and shifts in order to reduce the hardware
cost significantly. The goal is to ensure maximum circuit reuse
during computation while saving 40% of the number of operations.
The architecture is developed using FIFOs to compute the second
dimension. The proposed hardware was implemented in VHDL. The
VHDL RTL code runs at 240 MHz on an Altera Stratix III FPGA.
The number of cycles in this architecture varies from 33 for the 4-point
2-D DCT to 172 for the 16-point 2-D DCT. Results
show frequency improvements reaching 96% when compared to an
architecture described as the direct transcription of the algorithm.
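The "adders and shifts only" idea can be illustrated on the 4-point HEVC core transform, whose coefficients are 64, 83 and 36. The sketch below is a generic butterfly decomposition with the constant multiplications expanded into shifts and additions; it omits the scaling and rounding stages of the standard and is not the unified architecture described in the abstract.

```python
def mul83(v):
    """83*v as shifts and adds: 64 + 16 + 2 + 1."""
    return (v << 6) + (v << 4) + (v << 1) + v


def mul36(v):
    """36*v as shifts and adds: 32 + 4."""
    return (v << 5) + (v << 2)


def transform4(x):
    """One-dimensional 4-point HEVC core transform without multipliers."""
    e0, e1 = x[0] + x[3], x[1] + x[2]   # even part of the butterfly
    o0, o1 = x[0] - x[3], x[1] - x[2]   # odd part of the butterfly
    return [
        (e0 + e1) << 6,                 # row [64, 64, 64, 64]
        mul83(o0) + mul36(o1),          # row [83, 36, -36, -83]
        (e0 - e1) << 6,                 # row [64, -64, -64, 64]
        mul36(o0) - mul83(o1),          # row [36, -83, 83, -36]
    ]
```

A 2-D transform applies the same routine to the rows and then to the columns of the intermediate result, which is the role the abstract assigns to the FIFOs.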
Abstract: In this paper an efficient implementation of the RIPEMD-160
hash function is presented. Hash functions are a special family
of cryptographic algorithms that are used in technological
applications with requirements for security, confidentiality and
validity. Applications such as PKI, IPSec, DSA and MACs incorporate
hash functions and are widely used today. RIPEMD-160
emerged from the need for algorithms that are very strong
against cryptanalysis. The proposed hardware implementation can be
synthesized easily for a variety of FPGA and ASIC technologies.
Simulation results, using commercial tools, verified the efficiency of
the implementation in terms of performance and throughput. Special
care has been taken so that the proposed implementation does not
introduce extra design complexity, while functionality was
kept at the required levels.
Abstract: This paper presents software tools that convert the C/C++ floating-point source code for a DSP algorithm into a fixed-point simulation model that can be used to evaluate the numerical performance of the algorithm on several different fixed-point platforms including microprocessors, DSPs and FPGAs. The tools use a novel system for maintaining binary point information so that the conversion from floating point to fixed point is automated and the resulting fixed-point algorithm achieves the maximum possible precision. A configurable architecture is used during the simulation phase so that the algorithm can produce a bit-exact output for several different target devices.
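The idea of carrying binary point information alongside each value can be sketched as follows. Each number is stored as an integer plus its count of fractional bits, and arithmetic updates that count. The `Fixed` type, the 16-bit word length and the helper names are assumptions of this example and do not describe the tools presented in the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class Fixed:
    raw: int    # stored integer word
    frac: int   # number of fractional bits, i.e. the binary point position

    def to_float(self):
        return math.ldexp(self.raw, -self.frac)


def quantize(x, word_bits=16):
    """Place the binary point as far left as the magnitude of x allows."""
    int_bits = 0 if x == 0 else max(0, math.floor(math.log2(abs(x))) + 1)
    frac = word_bits - 1 - int_bits             # one bit is reserved for the sign
    raw = round(math.ldexp(x, frac))
    low, high = -(1 << (word_bits - 1)), (1 << (word_bits - 1)) - 1
    return Fixed(min(max(raw, low), high), frac)  # saturate on rounding overflow


def multiply(a, b):
    """Full-precision product: fractional bit counts add."""
    return Fixed(a.raw * b.raw, a.frac + b.frac)
```

For example, `multiply(quantize(0.75), quantize(3.5))` carries 28 fractional bits, so a converter knows how far to shift before storing the result back into a narrower word.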
Abstract: In this paper, a novel, simple and reliable digital firing
scheme is implemented for speed control of a three-phase
induction motor using an ac voltage controller. The system consists of
a three-phase supply connected to the three-phase induction motor via
three triacs and its control circuit. The ac voltage controller has three
modes of operation depending on the shape of the supply current. The
performance of the induction motor differs in each mode: the
speed is directly proportional to the firing angle in two modes and
inversely proportional in the third. The control system therefore has to detect the
current mode of operation to choose the correct firing angle of the triacs.
Three sensors feed the line currents to the control system to
detect the mode of operation. The control strategy is implemented
using a low-cost Xilinx Spartan-3E field programmable gate array
(FPGA) device. Three PI controllers are designed on the FPGA to
control the system in the three modes. Simulation of the system is
carried out using the PSIM program. The simulation results
show stable operation for different loading conditions, especially in
mode 2/3. The simulation results have been compared with the
experimental results from a laboratory prototype.
Abstract: The "PYRAMIDS" Block Cipher is a symmetric encryption algorithm of a 64-, 128-, or 256-bit length that accepts a variable key length of 128, 192, or 256 bits. The algorithm is an iterated cipher consisting of repeated applications of a simple round transformation with different operations and a different sequence in each round. The algorithm was previously implemented in software in C++ code. In this paper, a hardware implementation of the algorithm, using Field Programmable Gate Arrays (FPGA), is presented. In this work, we discuss the algorithm, the implemented micro-architecture, and the simulation and implementation results. Moreover, we present a detailed comparison with other implemented standard algorithms. In addition, we include the floor plan as well as the circuit diagrams of the various micro-architecture modules.
Abstract: Designing and implementing intelligent systems has become a crucial factor for the innovation and development of better products in space technologies. A neural network is a parallel system, capable of resolving paradigms that linear computing cannot. A field programmable gate array (FPGA) is a reprogrammable digital device with robust flexibility. For neural-network-based instrument prototypes in real-time applications, conventional application-specific VLSI neural chip design is limited by time and cost. With low-precision artificial neural network design, FPGAs offer higher speed and smaller size for real-time applications than VLSI and DSP chips. Many researchers have therefore made great efforts toward the realization of neural networks (NN) using FPGA techniques. In this paper, a brief introduction to ANNs and FPGA techniques is given. Hardware Description Language (VHDL) code is also proposed to implement ANNs, and simulation results with floating-point arithmetic are presented. Synthesis results for the ANN controller are obtained using Precision RTL. The proposed VHDL implementation provides a flexible, fast method with a high degree of parallelism for implementing ANNs. Implementing the multi-layer NN using a lookup table (LUT) reduces the resource utilization and the execution time.
Abstract: An approach to developing an FPGA implementation of a flexible-key
RSA encryption engine that can be used as a standard device in
secured communication systems is presented. The VHDL model of
this RSA encryption engine has the unique characteristic of
supporting multiple key sizes, and thus can easily fit into systems
that require different levels of security. Simple nested loops of addition
and subtraction are used to implement the RSA
operation. This makes the processing time faster and uses a
comparatively small amount of space in the FPGA. The hardware
design is targeted at the Altera STRATIX II family, and it was determined that
the flexible-key RSA encryption engine is best suited to the
device named EP2S30F484C3. The RSA encryption implementation
has made use of 13,779 units of logic elements and achieved a clock
frequency of 17.77 MHz. It has been verified that this RSA
encryption engine can perform 32-bit, 256-bit and 1024-bit
encryption operations in less than 41.585 µs, 531.515 µs and 790.61 µs
respectively.
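One way to read "nested loop addition and subtraction" is a modular exponentiation whose inner modular multiplication uses only doubling, addition and conditional subtraction, so no multiplier or divider is required. The sketch below follows that interpretation; it is an assumption about the general approach, and the function names are hypothetical.

```python
def mod_mul(a, b, n):
    """(a * b) mod n using shifts, additions and conditional subtractions.

    Requires 0 <= a < n and 0 <= b < n. Scans the bits of b from the MSB.
    """
    result = 0
    for bit in bin(b)[2:]:
        result <<= 1                # double the partial result
        if result >= n:
            result -= n             # a single subtraction restores result < n
        if bit == "1":
            result += a
            if result >= n:
                result -= n
    return result


def mod_exp(message, exponent, n):
    """Right-to-left square-and-multiply built on mod_mul (outer loop)."""
    result = 1 % n
    base = message % n
    while exponent:
        if exponent & 1:
            result = mod_mul(result, base, n)
        base = mod_mul(base, base, n)
        exponent >>= 1
    return result
```

Since the loops are bounded by the key length, the same datapath can serve several key sizes by changing only the iteration count, which matches the flexible-key goal stated in the abstract.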
Abstract: This paper presents the implementation of an attitude controller for a small UAV using a field programmable gate array (FPGA). Due to the small-size constraint, a miniature, more compact, and computationally powerful autopilot platform is needed for such systems. Moreover, a UAV autopilot has to deal with extremely adverse situations in the shortest possible time while accomplishing its mission. FPGAs have recently established themselves as fast, parallel, real-time processing devices in a compact size. This work exploits this fact and implements different attitude controllers for a small UAV on an FPGA, using its parallel processing capabilities. The attitude controller is designed in the MATLAB/Simulink environment. The discrete version of this controller is implemented using pipelining followed by retiming to reduce the critical path and thereby the clock period of the controller datapath. The pipelined, retimed, parallel PID controller is implemented using "System Generator", an efficient rapid-prototyping and testing tool developed by Xilinx for FPGA implementation. The improved timing performance enables the controller to react quickly to any changes in the attitude of the UAV.
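The discrete PID law that such a controller evaluates every sample period can be sketched in a few lines. This is the textbook positional form with rectangular integration and a backward-difference derivative; the gains, the sample time and the class name are assumptions of the example, and the pipelining and retiming described above are hardware concerns that a software sketch does not capture.

```python
class DiscretePID:
    """Positional discrete PID: u = Kp*e + Ki*sum(e)*dt + Kd*(e - e_prev)/dt."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt                   # rectangular integration
        derivative = (error - self.prev_error) / self.dt   # backward difference
        self.prev_error = error
        return (self.kp * error
                + self.ki * self.integral
                + self.kd * derivative)
```

The three terms are independent of each other within a sample, which is what allows a parallel hardware implementation to compute them side by side.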
Abstract: In this paper, a new method of controlling the position of an AC servomotor using a Field Programmable Gate Array (FPGA) is presented. The FPGA controller is used to generate the direction and the number of pulses required to rotate through a given angle. Pulses are sent as a square wave; the number of pulses determines the angle of rotation, and the frequency of the square wave determines the speed of rotation. The proposed control scheme has been realized using a XILINX FPGA SPARTAN XC3S400 and tested using a MUMA012PIS model Alternating Current (AC) servomotor. Experimental results show that the position of the AC servomotor can be controlled effectively. Keywords: Alternating Current (AC), Field Programmable Gate Array (FPGA), Liquid Crystal Display (LCD).
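The relationship between angle, pulse count and pulse frequency described above reduces to simple arithmetic. The sketch below assumes a drive resolution of `pulses_per_rev` command pulses per revolution; that parameter and the function name are hypothetical, since the abstract does not state the drive settings.

```python
def pulse_command(angle_deg, pulses_per_rev=10000, speed_rpm=60.0):
    """Return (direction, pulse_count, pulse_frequency_hz) for a move.

    direction: 1 for positive angles, 0 for negative angles (assumed encoding).
    """
    direction = 1 if angle_deg >= 0 else 0
    pulses = round(abs(angle_deg) / 360.0 * pulses_per_rev)   # angle sets the count
    frequency_hz = speed_rpm / 60.0 * pulses_per_rev          # speed sets the frequency
    return direction, pulses, frequency_hz
```

For instance, with these assumed settings a 90 degree move at 60 rpm needs 2500 pulses at 10 kHz.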
Abstract: This paper describes the use of dynamic reconfiguration to
miniaturize arithmetic circuits in a general-purpose processor. Dynamic
reconfiguration is a technique to realize required functions by
changing hardware construction during operation. The proposed
arithmetic circuit performs floating-point arithmetic which is
frequently used in science and technology. The data format is
floating point based on IEEE 754. The proposed circuit is designed
using VHDL, and its correct operation is verified by simulations and
experiments.
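For readers unfamiliar with the IEEE 754 data format mentioned above, the following sketch splits a value into the sign, exponent and fraction fields that a floating-point circuit operates on. Single precision is assumed here for illustration; the abstract does not say which precision the proposed circuit uses.

```python
import struct


def decode_float32(x):
    """Return (sign, biased_exponent, fraction) of x as an IEEE 754 binary32."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31                 # 1 bit
    exponent = (bits >> 23) & 0xFF    # 8 bits, bias 127
    fraction = bits & 0x7FFFFF        # 23 bits, implicit leading 1 for normal numbers
    return sign, exponent, fraction
```

Addition, multiplication and the other operations all start from these three fields, which is why they can share hardware that is reconfigured between operations.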
Abstract: In this paper the FPGA implementations for four
stream ciphers are presented. Two of them, MUGI and
SNOW 2.0, were recently adopted in the International Organization for
Standardization ISO/IEC 18033-4:2005 standard. The other two
stream ciphers, MICKEY 128 and TRIVIUM, have been submitted to
and are under consideration for eSTREAM, the ECRYPT
(European Network of Excellence for Cryptology) Stream Cipher
project. All ciphers were coded in VHDL. For the
hardware implementation, an FPGA device was used. The proposed
implementations achieve throughputs ranging from 166 Mbps for
MICKEY 128 to 6080 Mbps for MUGI.
Abstract: Optical flow has been a research topic of interest for many
years. It has, until recently, been largely inapplicable to real-time
applications due to its computationally expensive nature. This paper
presents a new reliable flow technique which is combined with a
motion detection algorithm, from stationary camera image streams,
to allow flow-based analyses of moving entities, such as rigidity, in
real time. Combining the optical flow analysis with the motion
detection technique greatly reduces the expensive computation of
flow vectors compared with standard approaches, making the
method applicable to real-time implementation. This paper
also describes the hardware implementation of a proposed pipelined
system to estimate the flow vectors from image sequences in real
time. This design can process 768 x 576 images at a very high frame
rate of up to 156 fps on a single low-cost FPGA chip, which is
adequate for most real-time vision applications.
Abstract: In this paper, we study FPGA implementation of a
novel supra-optimal receiver diversity combining technique,
generalized maximal ratio combining (GMRC), for wireless
transmission over fading channels in SIMO systems. Prior
published results using ML-detected GMRC diversity signal
driven by BPSK showed superior bit error rate performance to
the widely used MRC combining scheme in an imperfect
channel estimation (ICE) environment. Under perfect channel
estimation conditions, the performance of GMRC and MRC
were identical. The main drawback of the GMRC study was
that it was theoretical, thus successful FPGA implementation
of it using pipeline techniques is needed as a wireless
communication test-bed for practical real-life situations.
Simulation results showed that the hardware implementation
was efficient both in terms of speed and area. Since diversity
combining is especially effective in small femto- and picocells,
internet-associated wireless peripheral systems are to
benefit most from GMRC. As a result, many spinoff
applications can be made to the hardware of IP-based 4th
generation networks.
Abstract: A 5-bit, 125-MS/s successive
approximation register (SAR) analog-to-digital converter (ADC) implemented on
an FPGA is presented in this paper. The design and modeling of the high-performance
SAR ADC are based on the monotonic
capacitor switching procedure. A Spartan 3 FPGA is chosen
for implementing the SAR ADC algorithm. The SAR
VHDL program is written in the Xilinx environment, and ModelSim is used to show the
results.
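The successive-approximation principle behind such a converter is a binary search on the output code, one bit per clock cycle. The sketch below models the conventional trial-and-keep search for an ideal 5-bit converter; it does not model the monotonic capacitor switching procedure named in the abstract, and the unipolar input range `0..vref` is an assumption.

```python
def sar_convert(vin, vref=1.0, bits=5):
    """Ideal SAR conversion: decide one bit per step, MSB first."""
    code = 0
    for i in reversed(range(bits)):
        trial = code | (1 << i)                  # tentatively set the next bit
        if vin >= trial * vref / (1 << bits):    # comparator against the DAC level
            code = trial                         # keep the bit
    return code
```

A 5-bit conversion therefore takes five comparison steps, which ties the achievable sample rate directly to the clock frequency of the SAR logic.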
Abstract: When a small H/W IP is designed, we can develop an
appropriate verification environment by observing the simulated
signal waveforms or by using serial test vectors for a fixed output. In the
case of the design and verification of a massively parallel processor with
multiple IPs, it is difficult to build a verification system with existing
common verification environments and to verify each partial IP. The
TestDrive verification environment makes it possible to build an easy and reliable
verification system that produces highly intuitive results by
applying ModelSim and SystemVerilog's DPI. It offers many
advantages; for example, a high-level design of a GPGPU processor
can be migrated to an FPGA board immediately.